
add models #42


Merged: 6 commits merged on Jul 21, 2025

Conversation

BoyuanFeng (Contributor) commented on Jun 22, 2025:

This PR adds more benchmarks to CI. After this PR, we cover the following settings:

[Image: table of benchmark settings covered after this PR]

BoyuanFeng changed the title from "add gemma-3-27b-it and qwen3_30B-A3B" to "add models" on Jul 20, 2025.
Diff excerpt from the benchmark test configuration:

```
@@ -99,7 +183,112 @@
      }
    },
    {
      "test_name": "serving_llama4_maverick_fp8_tp8",
      "test_name": "serving_llama4_scout_tp4_random_in200_out200",
```
A contributor commented on this hunk:
Note: we need a better way to separate these cases with different input/output shapes on the dashboards. At the moment, only tensor parallel size and request rate are shown: https://hud.pytorch.org/benchmark/llms?repoName=vllm-project%2Fvllm

cc @yangw-dev if you have time to pick this up

@huydhn (Contributor) left a comment:

Thank you for adding these models!

BoyuanFeng merged commit 60beea6 into main on Jul 21, 2025.
42 of 48 checks passed